API Versioning and Contracts
The Surprising Part: A Single Field Rename Breaks Your Entire Client Fleet
# THE INCIDENT: Upload Service v1 returns this response
{
  "doc_id": "doc-001",
  "user_name": "user@example.com",
  "label": "invoice",
  "created_at": "2024-01-15T10:30:00Z"
}
# A developer renames user_name to username for "consistency" with the user service:
{
  "doc_id": "doc-001",
  "username": "user@example.com",
  "label": "invoice",
  "created_at": "2024-01-15T10:30:00Z"
}
# This change is deployed to production on a Friday at 3 PM.
# BLAST RADIUS - services that broke simultaneously:
# 1. Analytics Dashboard: reads user_name to display owner → now shows null
# 2. Notification Service: reads user_name for email subject → sends blank subject
# 3. Mobile App (iOS): cached user_name field → shows blank user everywhere
# 4. Mobile App (Android): same
# 5. Data Pipeline: ETL job reads user_name column → inserts nulls → reporting broken
# 6. Audit Logger: user_name in audit entries → blank entries → compliance issue
# Six consumers broken by one field rename.
# All consuming the same /v1/documents endpoint.
# None of them had any warning this was coming.
# The developer did not know any of these consumers existed.
# Root cause:
# - No API versioning (no way to deploy a breaking change alongside the old version)
# - No contract tests (no automated check that this change would break consumers)
# - No backward compatibility strategy (no dual-field support during migration)
# This lesson is about preventing every aspect of this incident.
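The failure mode above can be reproduced in a few lines. This is a minimal sketch, not code from the incident: the payload values and the two consumer functions (`email_subject`, `audit_owner`) are hypothetical stand-ins for the Notification Service and Audit Logger.

```python
# Hypothetical payloads: the owner value is made up for illustration.
v1_payload = {"doc_id": "doc-001", "user_name": "owner@example.com", "label": "invoice"}
renamed_payload = {"doc_id": "doc-001", "username": "owner@example.com", "label": "invoice"}


def email_subject(payload: dict) -> str:
    """A lenient consumer written against the original contract."""
    owner = payload.get("user_name")  # silently None once the field is renamed
    return f"New document from {owner or ''}".strip()


def audit_owner(payload: dict) -> str:
    """A strict consumer: fails loudly instead of writing blank entries."""
    return payload["user_name"]  # raises KeyError once the field is renamed


print(email_subject(v1_payload))
print(email_subject(renamed_payload))
```

The lenient reader keeps running and produces blank output, which is why this kind of breakage often reaches production unnoticed; the strict reader at least fails where the problem is.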
What You Will Learn
- Why Versioning Matters - calculating blast radius, the real cost of breaking changes
- Versioning Strategies - URL path, header, query parameter - pros/cons and recommendations
- URL Versioning in FastAPI - APIRouter with prefix, version router factory, sharing code
- OpenAPI Schema Evolution - additive vs breaking changes, compatibility rules
- Contract Testing with Pact - consumer-driven contracts, pact-python, provider verification
- Backward Compatibility - Tolerant Reader, Postel's Law, dual field support with Pydantic
- API Deprecation - Deprecation headers, sunset dates, migration guides
- Client SDK Generation - generating a typed Python client from OpenAPI spec
Prerequisites: FastAPI basics (Lesson 01), Pydantic v2, pytest.
Part 1: Why Versioning Matters - Calculating Blast Radius
The Breaking Change Categories
Not all changes are breaking. Understanding which changes are safe is the foundation of API evolution.
| Change | Breaking? | Example |
|---|---|---|
| Add optional field to response | No | {"new_field": null} - old clients ignore it |
| Add optional field to request | No | Old clients don't send it; server uses default |
| Remove field from response | YES | Clients reading that field now get null or KeyError |
| Rename field in response | YES | user_name → username - all consumers break |
| Change field type | YES | "count": "5" → "count": 5 - type errors in typed clients |
| Change field from optional to required in request | YES | Clients not sending it get 422 errors |
| Add required field to request | YES | Same |
| Change HTTP status code | YES | Clients checking exact codes break |
| Change URL path | YES | All bookmarks and hardcoded URLs break |
| Add enum value to response | Potentially | Clients with exhaustive enum checks fail |
| Change authentication scheme | YES | All clients need updates |
Estimating Blast Radius Before Deployment
# blast_radius_estimator.py
# Query your API access logs to find consumers of an endpoint
import re
from collections import Counter


def estimate_blast_radius(
    log_file: str,
    endpoint_pattern: str,
    field_name: str,
    last_days: int = 30,
) -> dict:
    """
    Parse nginx/uvicorn access logs (combined log format) to find:
    - How many unique clients hit this endpoint
    - How often
    - Which User-Agent strings (to identify client types)

    No date filtering happens here: last_days only labels the result keys,
    so pass a log file that already covers the window you care about.
    """
    hits_by_client = Counter()
    user_agents = set()
    endpoint_re = re.compile(endpoint_pattern)
    with open(log_file) as f:
        for line in f:
            if not endpoint_re.search(line):
                continue
            # Combined format: ip - user [time] "request" status bytes "referer" "user-agent"
            parts = line.split('"')
            if len(parts) < 6:
                continue  # Not combined format - skip rather than guess
            client_ip = line.split()[0]
            hits_by_client[client_ip] += 1
            user_agents.add(parts[5][:100])
    window = f"last_{last_days}_days"
    return {
        "endpoint_pattern": endpoint_pattern,
        "field_being_changed": field_name,
        f"unique_clients_{window}": len(hits_by_client),
        f"total_requests_{window}": sum(hits_by_client.values()),
        "top_clients": hits_by_client.most_common(10),
        "unique_user_agents": len(user_agents),
        # Rough heuristic: a Python HTTP library in the User-Agent suggests a backend service
        "estimated_services_affected": sum("python" in ua.lower() for ua in user_agents),
        "recommendation": (
            "HIGH RISK - coordinate with all consumers before deploying"
            if len(hits_by_client) > 5
            else "LOW RISK - few consumers, coordinate directly"
        ),
    }


# result = estimate_blast_radius(
#     log_file="/var/log/nginx/access.log",
#     endpoint_pattern=r"GET /api/v1/documents",
#     field_name="user_name",
# )
# {'unique_clients_last_30_days': 23, 'recommendation': 'HIGH RISK - ...'}
Part 2: Versioning Strategies
The Three Strategies
# Strategy 1: URL Path Versioning
# GET /v1/documents/doc-001
# GET /v2/documents/doc-001
# Strategy 2: Request Header Versioning
# GET /documents/doc-001
# Accept: application/vnd.docservice+json; version=2
# or
# API-Version: 2
# Strategy 3: Query Parameter Versioning
# GET /documents/doc-001?version=2
# GET /documents/doc-001?api-version=2024-01-01 (Azure style - date-based)
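To make the three strategies concrete, here is a small stdlib-only sketch that extracts a version number from a request using each of them in turn. The function name `resolve_version`, the precedence order (path, then header, then query), and the default of 1 are all assumptions made for illustration; a real service would normally pick one strategy and stick to it.

```python
import re
from urllib.parse import parse_qs, urlsplit

DEFAULT_VERSION = 1  # assumption: unversioned requests get the oldest supported version


def resolve_version(url: str, headers: dict[str, str]) -> tuple[int, str]:
    """Return (version, source), checking URL path, then API-Version header, then ?version=."""
    parts = urlsplit(url)

    # Strategy 1: URL path prefix such as /v2/documents/...
    path_match = re.match(r"^/v(\d+)/", parts.path)
    if path_match:
        return int(path_match.group(1)), "path"

    # Strategy 2: request header (header names are case-insensitive)
    lowered = {name.lower(): value for name, value in headers.items()}
    header_value = lowered.get("api-version", "")
    if header_value.isdigit():
        return int(header_value), "header"

    # Strategy 3: query parameter
    query_value = parse_qs(parts.query).get("version", [""])[0]
    if query_value.isdigit():
        return int(query_value), "query"

    return DEFAULT_VERSION, "default"


print(resolve_version("/v2/documents/doc-001", {}))
print(resolve_version("/documents/doc-001", {"API-Version": "2"}))
print(resolve_version("/documents/doc-001?version=2", {}))
```

Date-based versions such as `api-version=2024-01-01` would need string comparison instead of `int`; that case is left out to keep the sketch short.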
Comparison Table
| Factor | URL Path | Header | Query Parameter |
|---|---|---|---|
| Discoverability | High - visible in URLs | Low - must know header | Medium |
| Cacheability | Excellent - different URLs cache independently | Poor - cache must Vary on header | Good |
| Browser/curl friendliness | Excellent | Requires custom header | Good |
| REST purity | Debated (resource URI should identify resource, not version) | More REST-correct | Also debated |
| Default in industry | Most common | GitHub and Stripe use this | Azure, some internal APIs |
| Documentation clarity | Excellent - OpenAPI paths obvious | Requires header documentation | Good |
| Load balancer routing | Trivial (route on path prefix) | Requires header inspection | Medium |
Recommendation by use case:
| Use Case | Recommended Strategy |
|---|---|
| Public REST API for external developers | URL path (/v1/, /v2/) - most familiar |
| Internal microservice APIs | URL path or header - team preference |
| Experimental/preview features | Query parameter (?preview=true) |
| High-traffic services needing good caching | URL path |
| APIs where URL stability is a requirement | Header versioning |
This lesson uses URL path versioning - it is the most widely used, easiest to reason about, and easiest to deploy with zero client changes (old paths still work).
Part 3: URL Versioning in FastAPI
Basic Router Versioning
# upload_service/api/v1/documents.py
from fastapi import APIRouter, Depends

from upload_service.api.v1.schemas import DocumentResponseV1
from upload_service.dependencies import get_current_user, get_db_session

router_v1 = APIRouter(prefix="/v1/documents", tags=["Documents v1"])


@router_v1.get("/{doc_id}", response_model=DocumentResponseV1)
async def get_document_v1(
    doc_id: str,
    current_user: dict = Depends(get_current_user),
    db=Depends(get_db_session),
):
    # fetch_document is the service's own data-access helper (not shown)
    doc = await fetch_document(doc_id, db)
    return DocumentResponseV1(
        doc_id=doc.id,
        user_name=doc.owner_email,  # v1 uses "user_name" (old name)
        label=doc.label,
        created_at=doc.created_at,
    )


# upload_service/api/v2/documents.py
from fastapi import APIRouter, Depends

from upload_service.api.v2.schemas import DocumentResponseV2
from upload_service.dependencies import get_current_user, get_db_session

router_v2 = APIRouter(prefix="/v2/documents", tags=["Documents v2"])


@router_v2.get("/{doc_id}", response_model=DocumentResponseV2)
async def get_document_v2(
    doc_id: str,
    current_user: dict = Depends(get_current_user),
    db=Depends(get_db_session),
):
    doc = await fetch_document(doc_id, db)
    return DocumentResponseV2(
        doc_id=doc.id,
        username=doc.owner_email,  # v2 uses "username" (new name)
        owner_id=doc.owner_id,  # v2 adds owner_id
        label=doc.label,
        confidence=doc.confidence,  # v2 adds confidence
        created_at=doc.created_at,
    )


# upload_service/main.py
from fastapi import FastAPI

from upload_service.api.v1.documents import router_v1
from upload_service.api.v2.documents import router_v2

app = FastAPI(title="Upload Service")
app.include_router(router_v1)
app.include_router(router_v2)
Version Router Factory - DRY Pattern for Many Versions
# upload_service/api/versioning.py
from fastapi import APIRouter


class VersionedRouter:
    """
    Creates versioned routers that share common dependencies and middleware
    but have version-specific route implementations.
    """

    def __init__(self, resource_name: str, tags: list[str]):
        self._resource = resource_name
        self._tags = tags
        self._versions: dict[int, APIRouter] = {}

    def version(self, v: int) -> APIRouter:
        """Get or create a router for a specific version."""
        if v not in self._versions:
            self._versions[v] = APIRouter(
                prefix=f"/v{v}/{self._resource}",
                tags=[f"{tag} v{v}" for tag in self._tags],
            )
        return self._versions[v]

    def all_routers(self) -> list[APIRouter]:
        return list(self._versions.values())


# Usage:
document_router = VersionedRouter("documents", ["Documents"])


@document_router.version(1).get("/{doc_id}")
async def get_document_v1(doc_id: str):
    ...


@document_router.version(2).get("/{doc_id}")
async def get_document_v2(doc_id: str):
    ...


# app is the FastAPI instance created in main.py
for router in document_router.all_routers():
    app.include_router(router)
Sharing Code Between Versions
The key to sustainable versioning: business logic is shared, only the serialisation/schema differs.
# upload_service/services/document_service.py - VERSION-AGNOSTIC business logic
class DocumentService:
    """Pure business logic. No knowledge of API versions."""

    def __init__(self, db: AsyncSession, storage: StorageClient):
        self._db = db
        self._storage = storage

    async def get_document(self, doc_id: str, user_id: str) -> DocumentDomain:
        """Returns a domain model - not a schema. Versions adapt this."""
        doc = await self._db.get(DocumentORM, doc_id)
        if not doc or doc.owner_id != user_id:
            raise DocumentNotFoundError(doc_id)
        return DocumentDomain(
            id=doc.id,
            owner_id=doc.owner_id,
            owner_email=doc.owner_email,
            label=doc.label,
            confidence=doc.confidence,
            created_at=doc.created_at,
            storage_key=doc.storage_key,
        )


# upload_service/api/v1/schemas.py - V1 schema maps the domain model
from datetime import datetime

from pydantic import BaseModel


class DocumentResponseV1(BaseModel):
    doc_id: str
    user_name: str  # Old field name
    label: str | None
    created_at: datetime

    @classmethod
    def from_domain(cls, doc: DocumentDomain) -> "DocumentResponseV1":
        return cls(
            doc_id=doc.id,
            user_name=doc.owner_email,  # Map from owner_email
            label=doc.label,
            created_at=doc.created_at,
        )


# upload_service/api/v2/schemas.py - V2 schema maps the same domain model differently
from datetime import datetime

from pydantic import BaseModel


class DocumentResponseV2(BaseModel):
    doc_id: str
    username: str  # New field name
    owner_id: str  # New field
    label: str | None
    confidence: float | None  # New field
    created_at: datetime

    @classmethod
    def from_domain(cls, doc: DocumentDomain) -> "DocumentResponseV2":
        return cls(
            doc_id=doc.id,
            username=doc.owner_email,
            owner_id=doc.owner_id,
            label=doc.label,
            confidence=doc.confidence,
            created_at=doc.created_at,
        )


# Routes are thin - they call the service and adapt the result
@router_v1.get("/{doc_id}", response_model=DocumentResponseV1)
async def get_document_v1(
    doc_id: str,
    current_user: dict = Depends(get_current_user),
    service: DocumentService = Depends(get_document_service),
):
    doc = await service.get_document(doc_id, current_user["id"])
    return DocumentResponseV1.from_domain(doc)


@router_v2.get("/{doc_id}", response_model=DocumentResponseV2)
async def get_document_v2(
    doc_id: str,
    current_user: dict = Depends(get_current_user),
    service: DocumentService = Depends(get_document_service),
):
    doc = await service.get_document(doc_id, current_user["id"])
    return DocumentResponseV2.from_domain(doc)
Sunsetting Old Versions
# upload_service/middleware/deprecation.py
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response

DEPRECATED_VERSIONS = {
    "v1": {
        # The Sunset header (RFC 8594) carries an HTTP-date, not an ISO date
        "sunset_date": "Sun, 01 Jun 2025 00:00:00 GMT",
        "migration_guide": "https://docs.example.com/api/migrate-v1-to-v2",
        "successor": "v2",
    },
}


class DeprecationMiddleware(BaseHTTPMiddleware):
    """Adds deprecation warnings to responses for old API versions."""

    async def dispatch(self, request: Request, call_next) -> Response:
        response = await call_next(request)
        # Check if this request targets a deprecated version
        path = request.url.path
        for version, info in DEPRECATED_VERSIONS.items():
            if f"/{version}/" in path:
                sunset_date = info["sunset_date"]
                migration_url = info["migration_guide"]
                successor = info["successor"]
                # RFC 8594 Sunset header - when this version stops responding
                response.headers["Sunset"] = sunset_date
                # "true" follows the earlier Internet-Draft; RFC 9745 defines the
                # value as a date of the form @<unix-seconds>
                response.headers["Deprecation"] = "true"
                # Replace only the version path segment, and only once
                successor_path = path.replace(f"/{version}/", f"/{successor}/", 1)
                response.headers["Link"] = (
                    f'<{migration_url}>; rel="deprecation", '
                    f'<{successor_path}>; rel="successor-version"'
                )
                response.headers["Warning"] = (
                    f'299 - "This API version ({version}) is deprecated and will be '
                    f'removed on {sunset_date}. Migrate to {successor}: {migration_url}"'
                )
                break
        return response
Part 4: OpenAPI Schema Evolution
The Rules of Backward Compatible Schema Changes
from datetime import datetime

from pydantic import BaseModel


# SAFE (additive) changes - old clients continue to work:
# 1. Add an optional field to a response
class DocumentResponseV1(BaseModel):
    doc_id: str
    label: str | None
    # SAFE: Adding an optional field - old clients ignore it
    confidence: float | None = None  # NEW - optional, has default


# 2. Add an optional field to a request (with a default)
class ClassifyRequestV1(BaseModel):
    text: str
    document_id: str
    model_version: str | None = None  # NEW - optional, old clients don't send it


# UNSAFE (breaking) changes:
# 3. Remove a field from a response - BREAKING
class DocumentResponseV2_WRONG_REMOVED(BaseModel):
    doc_id: str
    # label: str | None  ← REMOVED: old clients get None from data.get("label")
    #                      or KeyError from data["label"]


# 4. Change a field type - BREAKING
class DocumentResponseV2_WRONG_TYPE(BaseModel):
    doc_id: str
    label: str | None
    created_at: int  # Changed from datetime to Unix timestamp int - BREAKING


# 5. Make an optional field required in a request - BREAKING
class ClassifyRequestV2_WRONG(BaseModel):
    text: str
    document_id: str
    model_version: str  # Changed from optional to required - old clients break


# CORRECT approach: version the schema and give the field a default
class ClassifyRequestV2(BaseModel):
    text: str
    document_id: str
    model_version: str = "latest"  # Has a sensible default - backward compatible


# Using Pydantic model inheritance for versioned schemas
class DocumentResponseBase(BaseModel):
    """Fields shared across all versions."""
    doc_id: str
    label: str | None = None
    created_at: datetime


class DocumentResponseV1(DocumentResponseBase):
    user_name: str  # v1 name


class DocumentResponseV2(DocumentResponseBase):
    username: str  # v2 name
    owner_id: str
    confidence: float | None = None


class DocumentResponseV3(DocumentResponseV2):
    """v3 adds metadata - backward compatible addition."""
    page_count: int | None = None
    language: str | None = None
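The compatibility rules above can be checked mechanically. The following is a rough sketch, assuming flat JSON-Schema-style dictionaries with `properties` and `required` keys such as those found in an exported OpenAPI document; `find_breaking_changes` is a hypothetical helper, and it ignores nesting, enums, and `$ref`.

```python
def find_breaking_changes(old: dict, new: dict, *, kind: str) -> list[str]:
    """Compare two flat schemas. kind is "response" or "request".

    response: a removed (or renamed) property breaks readers.
    request:  a newly required property breaks writers.
    either:   a changed property type breaks typed clients.
    """
    problems: list[str] = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    for name, spec in old_props.items():
        if name not in new_props:
            if kind == "response":
                problems.append(f"removed field: {name}")
        elif spec.get("type") != new_props[name].get("type"):
            problems.append(
                f"type changed: {name} ({spec.get('type')} -> {new_props[name].get('type')})"
            )

    if kind == "request":
        newly_required = set(new.get("required", [])) - set(old.get("required", []))
        problems.extend(f"newly required field: {name}" for name in sorted(newly_required))

    return problems


old_response = {"properties": {"doc_id": {"type": "string"}, "user_name": {"type": "string"}}}
renamed = {"properties": {"doc_id": {"type": "string"}, "username": {"type": "string"}}}
additive = {"properties": {**old_response["properties"], "confidence": {"type": "number"}}}

print(find_breaking_changes(old_response, renamed, kind="response"))
print(find_breaking_changes(old_response, additive, kind="response"))
```

Note that a rename shows up as a removal: from the old client's point of view, the field it reads is simply gone.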
Part 5: Contract Testing with Pact
What is Consumer-Driven Contract Testing?
WITHOUT contract testing:
- Upload Service (Consumer) calls Classification Service (Provider)
- Consumer assumes {"label": "invoice", "confidence": 0.97}
- Provider changes response to {"document_label": "invoice", "score": 0.97}
- Provider's unit tests pass (they test their own code)
- Consumer's unit tests pass (they mock the provider)
- Integration test environment has a different version - mismatch not caught
- Broken in PRODUCTION
WITH Pact contract testing:
- Consumer writes a test that specifies exactly what request/response it expects
- Pact generates a "contract" file (JSON pact file)
- Provider downloads the contract and verifies it can satisfy every interaction
- If provider changes the response shape, contract verification FAILS
- Broken before it reaches production - in CI
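The core idea of provider verification can be illustrated without Pact at all. The sketch below is a simplified stand-in, not Pact's implementation: `matches_like` is a hypothetical function that checks a provider response against a consumer's example using type-based matching (same keys and types, values free to differ, extra fields allowed).

```python
def matches_like(expected, actual, path: str = "$") -> list[str]:
    """Return a list of mismatches; an empty list means the contract is satisfied."""
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return [f"{path}: expected object"]
        errors = []
        for key, value in expected.items():
            if key not in actual:
                errors.append(f"{path}.{key}: missing")
            else:
                errors.extend(matches_like(value, actual[key], f"{path}.{key}"))
        return errors  # extra keys in actual are fine: the consumer never reads them
    if isinstance(expected, list):
        if not isinstance(actual, list):
            return [f"{path}: expected array"]
        if not expected:
            return []
        if not actual:
            return [f"{path}: expected at least one element"]
        errors = []
        for index, item in enumerate(actual):
            errors.extend(matches_like(expected[0], item, f"{path}[{index}]"))
        return errors
    if type(expected) is not type(actual):
        return [f"{path}: expected {type(expected).__name__}, got {type(actual).__name__}"]
    return []


# What the consumer recorded, and two candidate provider responses
contract = {"label": "invoice", "confidence": 0.97}
compatible = {"label": "receipt", "confidence": 0.5, "model_version": "v3"}
renamed = {"document_label": "invoice", "score": 0.97}

print(matches_like(contract, compatible))
print(matches_like(contract, renamed))
```

The compatible response passes even though its values and an extra field differ, while the renamed response fails on both fields the consumer depends on. That asymmetry is what lets a provider evolve additively without breaking verification.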
Consumer Test (Upload Service)
# tests/contract/test_classification_consumer.py
# Upload Service is the CONSUMER - it calls Classification Service
import asyncio

import httpx
import pytest
from pact import Consumer, EachLike, Like, Provider

# Define what Upload Service expects from Classification Service
pact = Consumer("upload-service").has_pact_with(
    Provider("classification-service"),
    host_name="localhost",
    port=1234,  # Pact starts a mock server on this port during the test
    pact_dir="./pacts",  # Where to save the generated pact file
    log_dir="./logs",
)


class TestClassificationServiceContract:
    def test_classify_single_document(self):
        """
        Consumer test: define what Upload Service expects when it calls
        POST /classify with a valid document.
        """
        expected_request = {
            "text": "This invoice is due on January 15th. Total amount: $1,200.00.",
            "doc_id": "doc-test-001",
            "max_results": 3,
        }
        # Use Pact matchers - Like() means "same type, not exact value"
        # This makes the contract flexible: the exact confidence value can differ
        # as long as it is a float. This is important - we don't want brittle contracts.
        expected_response = {
            "document_id": Like("doc-test-001"),
            "model_version": Like("v3.2.1"),
            "processing_time_ms": Like(150),
            "results": EachLike({
                "document_type": Like("DOCUMENT_TYPE_INVOICE"),
                "confidence": Like(0.97),
                "keywords": EachLike("invoice"),
            }),
        }
        # Set up the interaction (what we will call and what we expect back)
        (pact
            .given("a valid document text is provided")
            .upon_receiving("a classification request")
            .with_request(
                method="POST",
                path="/classify",
                headers={"Content-Type": "application/json"},
                body=expected_request,
            )
            .will_respond_with(
                status=200,
                headers={"Content-Type": "application/json"},
                body=expected_response,
            ))
        with pact:
            # Call the REAL consumer code against the MOCK provider
            # This verifies that the consumer code correctly interprets the response
            result = asyncio.run(
                call_classification_service_under_test(
                    text=expected_request["text"],
                    doc_id=expected_request["doc_id"],
                    base_url="http://localhost:1234",
                )
            )
        # Verify that our consumer code correctly processed the response
        assert result["label"] == "DOCUMENT_TYPE_INVOICE"
        assert result["confidence"] > 0.0
        # Pact automatically saved the contract to ./pacts/upload-service-classification-service.json

    def test_classify_with_invalid_text_returns_422(self):
        """Consumer test: what happens when we send empty text."""
        (pact
            .given("empty text is provided")
            .upon_receiving("a classification request with empty text")
            .with_request(
                method="POST",
                path="/classify",
                body={"text": "", "doc_id": "doc-test-002", "max_results": 3},
            )
            .will_respond_with(
                status=422,
                body={
                    "error_code": Like("INVALID_ARGUMENT"),
                    "message": Like("text field is required"),
                },
            ))
        with pact:
            with pytest.raises(ValueError, match="Invalid classification request"):
                asyncio.run(
                    call_classification_service_under_test(
                        text="",
                        doc_id="doc-test-002",
                        base_url="http://localhost:1234",
                    )
                )


async def call_classification_service_under_test(
    text: str, doc_id: str, base_url: str
) -> dict:
    """The actual consumer code being tested against the mock."""
    async with httpx.AsyncClient(base_url=base_url) as client:
        response = await client.post(
            "/classify",
            json={"text": text, "doc_id": doc_id, "max_results": 3},
        )
        if response.status_code == 422:
            raise ValueError(f"Invalid classification request: {response.json().get('message')}")
        response.raise_for_status()
        data = response.json()
        # Consumer maps the response to its internal format
        results = data.get("results", [])
        if not results:
            return {"label": "unknown", "confidence": 0.0}
        top = results[0]
        return {
            "label": top["document_type"],
            "confidence": top["confidence"],
            "model_version": data["model_version"],
        }
The Generated Pact File
{
"consumer": {"name": "upload-service"},
"provider": {"name": "classification-service"},
"interactions": [
{
"description": "a classification request",
"providerState": "a valid document text is provided",
"request": {
"method": "POST",
"path": "/classify",
"headers": {"Content-Type": "application/json"},
"body": {
"text": "This invoice is due on January 15th. Total amount: $1,200.00.",
"doc_id": "doc-test-001",
"max_results": 3
}
},
"response": {
"status": 200,
"headers": {"Content-Type": "application/json"},
"body": {
"document_id": "doc-test-001",
"model_version": "v3.2.1",
"processing_time_ms": 150,
"results": [
{
"document_type": "DOCUMENT_TYPE_INVOICE",
"confidence": 0.97,
"keywords": ["invoice"]
}
]
},
"matchingRules": {
"$.body.document_id": {"match": "type"},
"$.body.model_version": {"match": "type"},
"$.body.processing_time_ms": {"match": "type"},
"$.body.results[*].document_type": {"match": "type"},
"$.body.results[*].confidence": {"match": "type"},
"$.body.results[*].keywords[*]": {"match": "type"}
}
}
}
],
"metadata": {"pactSpecification": {"version": "2.0.0"}}
}
Provider Verification (Classification Service)
# tests/contract/test_classification_provider.py
# Classification Service verifies it satisfies the consumer's contracts
from fastapi import FastAPI, Request
from pact import Verifier

# Provider states - set up test data for each "given" clause in the contract
provider_states = {}


def register_state(state: str):
    def decorator(func):
        provider_states[state] = func
        return func
    return decorator


@register_state("a valid document text is provided")
async def setup_valid_document():
    """Ensure the ML model is loaded and ready."""
    # In a test environment, this might load a small test model
    pass  # Model is loaded in lifespan


@register_state("empty text is provided")
async def setup_empty_text():
    """No setup needed - we're testing validation."""
    pass


# FastAPI app with a provider state endpoint for Pact to call.
# It has to be served by (or mounted into) the test server on port 8002,
# because provider_states_setup_url below points there.
state_app = FastAPI()


@state_app.post("/_pact/provider-states")
async def set_provider_state(request: Request):
    body = await request.json()
    state = body.get("state")
    if state in provider_states:
        await provider_states[state]()
    return {"state": state, "status": "set"}


class TestClassificationServiceProvider:
    def test_satisfies_upload_service_contract(self):
        """
        Verify that Classification Service satisfies all interactions
        defined by Upload Service's contract.
        """
        verifier = Verifier(
            provider="classification-service",
            provider_base_url="http://localhost:8002",  # Classification Service test server
        )
        output, _ = verifier.verify_with_broker(
            broker_url="http://pact-broker:9292",
            broker_username="admin",
            broker_password="secret",
            consumer_version_selectors=[
                {"mainBranch": True},
                {"deployedOrReleased": True},
            ],
            provider_states_setup_url="http://localhost:8002/_pact/provider-states",
            publish_verification_results=True,
            provider_version="1.2.0",
            provider_version_branch="main",
        )
        assert output == 0, "Provider contract verification failed"

    def test_satisfies_local_pact_file(self):
        """For local development - verify against locally generated pact."""
        verifier = Verifier(
            provider="classification-service",
            provider_base_url="http://localhost:8002",
        )
        output, _ = verifier.verify_pacts(
            "./pacts/upload-service-classification-service.json",
            provider_states_setup_url="http://localhost:8002/_pact/provider-states",
        )
        assert output == 0, "Provider did not satisfy contract"
Part 6: Backward Compatibility Strategies
The Tolerant Reader Pattern
A robust consumer ignores fields it does not understand, rather than failing on unexpected fields.
# upload_service/clients/classification_client.py
from typing import TypedDict

from pydantic import BaseModel


# BRITTLE consumer - a fixed shape with no room for new fields
class StrictClassificationResponse(TypedDict):
    document_id: str
    label: str
    confidence: float

# TypedDict does no runtime validation, so the breakage surfaces later: if
# Classification Service adds "model_version", code that unpacks the payload into
# a fixed signature raises TypeError, and a Pydantic model with extra="forbid"
# raises ValidationError.


# TOLERANT consumer - Pydantic with extra="ignore" (the default)
class TolerantClassificationResponse(BaseModel):
    document_id: str
    label: str
    confidence: float
    model_version: str | None = None  # Optional - handles both old and new provider
    model_config = {"extra": "ignore"}  # Unknown fields are silently ignored


async def fetch_classification(http_client, doc_id: str) -> TolerantClassificationResponse:
    # Parse the response tolerantly
    response = await http_client.get(f"/classify/{doc_id}")
    # Works whether provider sends model_version or not
    return TolerantClassificationResponse.model_validate(response.json())
Postel's Law: Be Conservative in What You Send, Liberal in What You Accept
# Being liberal in what we accept - supporting both old and new field names during migration
from typing import Any

from pydantic import BaseModel, model_validator


class DocumentRequest(BaseModel):
    """
    Accepts both the old field name (user_name) and the new field name (username).
    This allows a gradual migration: old clients keep working while new clients
    can use the new field name.
    """
    doc_id: str
    username: str | None = None
    user_name: str | None = None  # Deprecated field - kept for backward compatibility
    text: str

    @model_validator(mode="before")
    @classmethod
    def normalise_user_field(cls, data: Any) -> Any:
        """Accept either user_name (old) or username (new), normalise to username."""
        if isinstance(data, dict):
            # If old field is present but new field is not, migrate it
            if "user_name" in data and "username" not in data:
                data = {**data, "username": data["user_name"]}  # copy, don't mutate caller's dict
            # user_name itself is kept for the duration of the migration period
        return data

    @property
    def resolved_username(self) -> str:
        """Always returns a value - handles both field names."""
        return self.username or self.user_name or ""


# Test both field names work
old_style = DocumentRequest.model_validate({"doc_id": "d1", "user_name": "alice", "text": "hi"})
new_style = DocumentRequest.model_validate({"doc_id": "d1", "username": "alice", "text": "hi"})
assert old_style.resolved_username == new_style.resolved_username == "alice"
Dual-Field Response During Migration
# During migration, return BOTH field names in the response
# Old clients read user_name, new clients read username
# After all clients have migrated, remove user_name
from typing import Any

from pydantic import BaseModel, model_validator


class DocumentResponseMigration(BaseModel):
    doc_id: str
    username: str  # New canonical name
    user_name: str  # Deprecated alias - same value as username
    label: str | None = None

    @model_validator(mode="before")
    @classmethod
    def sync_user_fields(cls, data: Any) -> Any:
        """Ensure both field names always have the same value."""
        if isinstance(data, dict):
            if "username" in data and "user_name" not in data:
                data = {**data, "user_name": data["username"]}
            elif "user_name" in data and "username" not in data:
                data = {**data, "username": data["user_name"]}
        return data


# Response payload:
# {
#   "doc_id": "doc-001",
#   "username": "user@example.com",   ← new clients use this
#   "user_name": "user@example.com",  ← old clients use this
#   "label": "invoice"
# }
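On the consumer side, the dual-field window is the moment to switch readers over. A minimal sketch of one way to do that, assuming the payload is a plain dict; `read_username` is a hypothetical helper that prefers the new name and falls back to the old one, so the consumer can ship before or after the provider does.

```python
def read_username(payload: dict) -> str:
    """Prefer the new field name, fall back to the deprecated one."""
    for key in ("username", "user_name"):
        value = payload.get(key)
        if value:
            return value
    raise KeyError("payload has neither 'username' nor 'user_name'")


# The same reader works against all three provider states during the migration
before_migration = {"doc_id": "doc-001", "user_name": "owner@example.com"}
during_migration = {"doc_id": "doc-001", "username": "owner@example.com", "user_name": "owner@example.com"}
after_migration = {"doc_id": "doc-001", "username": "owner@example.com"}

for payload in (before_migration, during_migration, after_migration):
    print(read_username(payload))
```

Once the provider has dropped `user_name`, the fallback branch can be deleted from the consumer as well.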
Part 7: API Deprecation
Deprecation Response Headers
# upload_service/routes/documents_v1.py
import logging

from fastapi import APIRouter, Request, Response

router_v1 = APIRouter(prefix="/v1/documents", tags=["Documents v1 (Deprecated)"])

SUNSET_DATE = "Sun, 01 Jun 2025 00:00:00 GMT"  # Sunset takes an HTTP-date (RFC 8594)
DEPRECATED_SINCE = "@1733011200"  # 2024-12-01T00:00:00Z as Unix seconds (RFC 9745 date form)
MIGRATION_GUIDE = "https://docs.example.com/api/v1-to-v2-migration"


def add_deprecation_headers(response: Response, version: str = "v1") -> None:
    """Add standard deprecation headers to a response."""
    # RFC 8594: Sunset header - when the API will be removed
    response.headers["Sunset"] = SUNSET_DATE
    # RFC 9745: Deprecation header - when the API was deprecated
    response.headers["Deprecation"] = DEPRECATED_SINCE
    # Link header: points to migration guide and successor
    response.headers["Link"] = (
        f'<{MIGRATION_GUIDE}>; rel="deprecation", '
        '</v2/documents>; rel="successor-version"'
    )
    # Warning header: human-readable deprecation notice
    # (RFC 9111 has obsoleted Warning, so treat it as a courtesy for older clients)
    response.headers["Warning"] = (
        f'299 - "API {version} is deprecated and will be removed on '
        f'{SUNSET_DATE}. See {MIGRATION_GUIDE}"'
    )


@router_v1.get(
    "/{doc_id}",
    deprecated=True,  # FastAPI marks route as deprecated in OpenAPI docs
    summary="[DEPRECATED] Get document - use /v2/documents instead",
)
async def get_document_v1(doc_id: str, response: Response):
    add_deprecation_headers(response)
    # ... existing v1 implementation


# You can also log deprecation usage to track which clients still use v1
deprecation_logger = logging.getLogger("api.deprecation")


@router_v1.post("/upload", deprecated=True)
async def upload_document_v1(response: Response, request: Request):
    add_deprecation_headers(response)
    # Log which client is still using the deprecated endpoint
    # This helps identify teams to contact for migration
    deprecation_logger.warning(
        "Deprecated v1 endpoint called",
        extra={
            "endpoint": "/v1/documents/upload",
            "client_ip": request.client.host if request.client else "unknown",
            "user_agent": request.headers.get("User-Agent", "unknown"),
        },
    )
    # ... v1 implementation
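The Sunset header is defined to carry an HTTP-date, which is easy to get wrong when the date lives in configuration as an ISO string. The sketch below shows one way to derive both header values from timezone-aware datetimes using only the standard library. The helper names are hypothetical, and the `@<unix-seconds>` form for Deprecation reflects my reading of RFC 9745; older drafts used other forms, so check what your clients expect.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def sunset_header_value(when: datetime) -> str:
    """Format an aware datetime as an HTTP-date, e.g. 'Sun, 01 Jun 2025 00:00:00 GMT'."""
    # usegmt=True requires a datetime whose tzinfo is UTC, hence the conversion
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


def deprecation_header_value(when: datetime) -> str:
    """Format an aware datetime as a structured-field date: '@' plus Unix seconds."""
    return f"@{int(when.timestamp())}"


sunset_at = datetime(2025, 6, 1, tzinfo=timezone.utc)
deprecated_at = datetime(2024, 12, 1, tzinfo=timezone.utc)

print(sunset_header_value(sunset_at))
print(deprecation_header_value(deprecated_at))
```

Keeping the dates as datetimes also lets the same values drive enforcement logic, instead of maintaining a string for the header and a separate datetime for the cutoff.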
Sunset Policy and Enforcement
# upload_service/middleware/sunset_enforcement.py
from datetime import datetime, timezone

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse

SUNSET_SCHEDULE = {
    "/v1/": datetime(2025, 6, 1, tzinfo=timezone.utc),
    "/v0/": datetime(2024, 12, 1, tzinfo=timezone.utc),  # Already past → refuse requests
}


class SunsetEnforcementMiddleware(BaseHTTPMiddleware):
    """
    After the sunset date, actively reject requests to deprecated versions
    with a 410 Gone response pointing to the migration guide.
    """

    async def dispatch(self, request, call_next):
        path = request.url.path
        now = datetime.now(timezone.utc)
        for prefix, sunset_date in SUNSET_SCHEDULE.items():
            if path.startswith(prefix) and now > sunset_date:
                return JSONResponse(
                    status_code=410,  # Gone - resource no longer exists
                    content={
                        "error_code": "API_VERSION_REMOVED",
                        "message": (
                            f"API version {prefix.strip('/')} was removed on "
                            f"{sunset_date.strftime('%Y-%m-%d')}. "
                            "Please migrate to the current version."
                        ),
                        "migration_guide": "https://docs.example.com/api/migration",
                        "current_version": "/v2/",
                    },
                    headers={
                        "Link": '<https://docs.example.com/api/migration>; rel="deprecation"',
                    },
                )
        return await call_next(request)
Part 8: Client SDK Generation
Generating a Typed Python Client from OpenAPI
# Step 1: Export OpenAPI schema from the running service
curl http://localhost:8001/openapi.json | python -m json.tool > upload-service-openapi.json
# Step 2: Generate a typed Python client using openapi-generator
docker run --rm \
-v $(pwd)/upload-service-openapi.json:/openapi.json \
-v $(pwd)/clients:/output \
openapitools/openapi-generator-cli:v7.2.0 generate \
-i /openapi.json \
-g python \
-o /output/upload-service-client \
--additional-properties \
packageName=upload_service_client,\
projectName=upload-service-client,\
packageVersion=1.2.0,\
generateSourceCodeOnly=true,\
library=asyncio
# This generates:
# clients/upload-service-client/
# ├── upload_service_client/
# │ ├── api/
# │ │ └── documents_api.py ← typed API methods
# │ ├── models/
# │ │ ├── document_response.py
# │ │ ├── document_upload_response.py
# │ │ └── error_detail.py
# │ ├── api_client.py
# │ └── configuration.py
# ├── setup.py
# └── README.md
Using the Generated Client
# In another service (Processing Service) using the generated Upload Service client
# pip install ./clients/upload-service-client
from upload_service_client import ApiClient, Configuration
from upload_service_client.api import DocumentsApi
from upload_service_client.exceptions import ApiException
from upload_service_client.models import DocumentUploadResponse

# DocumentNotFoundError, AuthenticationError and UploadServiceError are the
# Processing Service's own exception types (not shown).


async def get_document_from_upload_service(
    doc_id: str,
    access_token: str,
) -> DocumentUploadResponse:
    config = Configuration(
        host="http://upload-service:8001",
        access_token=access_token,
    )
    async with ApiClient(configuration=config) as api_client:
        documents_api = DocumentsApi(api_client)
        try:
            # Fully typed - IDE knows the return type is DocumentUploadResponse
            document: DocumentUploadResponse = await documents_api.get_document(doc_id=doc_id)
            # IDE autocomplete works on all fields
            print(f"Document ID: {document.doc_id}")
            print(f"Status: {document.status}")  # a str, generated from a Literal
            print(f"Label: {document.label}")
            return document
        except ApiException as exc:
            if exc.status == 404:
                raise DocumentNotFoundError(doc_id) from exc
            elif exc.status == 401:
                raise AuthenticationError("Upload Service rejected the token") from exc
            else:
                raise UploadServiceError(f"Unexpected error: {exc.status} {exc.reason}") from exc
Automating SDK Generation in CI
# .gitlab-ci.yml - automatically regenerate and publish SDK on API changes
stages:
  - test
  - generate-sdk
  - publish-sdk

generate-client-sdk:
  stage: generate-sdk
  # NOTE: the script below needs Python, curl, git and the generator CLI in one image.
  # The stock generator image is Java-based, so in practice this likely means a custom
  # image or exporting openapi.json in an earlier Python job.
  image: openapitools/openapi-generator-cli:v7.2.0
  script:
    # Start the service temporarily to export the schema
    - pip install -e .
    - uvicorn upload_service.main:app --host 0.0.0.0 --port 8001 &
    - sleep 5
    - curl http://localhost:8001/openapi.json > openapi.json
    - kill %1
    # Generate the client
    - openapi-generator-cli generate
      -i openapi.json
      -g python
      -o clients/upload-service-client
      --additional-properties packageVersion=${CI_COMMIT_TAG:-dev}
    # Check if the generated client differs from the committed one
    - git diff --exit-code clients/ || echo "API schema changed - review diff"
  artifacts:
    paths:
      - clients/upload-service-client/
    expire_in: 1 week
  only:
    - main
    - tags

publish-sdk-to-pypi:
  stage: publish-sdk
  script:
    - cd clients/upload-service-client
    - pip install build twine
    - python -m build
    - twine upload --repository-url $PYPI_URL dist/* --username $PYPI_USER --password $PYPI_PASS
  only:
    - tags  # Only publish on tagged releases
Interview Patterns
Q: What is the difference between a breaking and a non-breaking API change? Give three examples of each.
A: Non-breaking (additive) changes: (1) adding an optional response field with a default, (2) adding an optional request parameter with a default value, (3) adding a new endpoint. Breaking changes: (1) removing or renaming a response field - clients reading that field get null or an error; (2) making an optional request field required - existing clients that don't send it get 422; (3) changing a field's type (e.g., string to int) - typed clients get a deserialization error.
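The rules in this answer are mechanical enough to check in code. The following is a minimal sketch, not a full OpenAPI diff tool: schemas are reduced to plain dictionaries, and the field names `page_count` and `category` are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Tuple

# Response schemas map field name -> JSON type name.
ResponseSchema = Dict[str, str]
# Request schemas map field name -> (JSON type name, is_required).
RequestSchema = Dict[str, Tuple[str, bool]]


def breaking_response_changes(old: ResponseSchema, new: ResponseSchema) -> List[str]:
    """List changes that break consumers reading the response."""
    problems: List[str] = []
    for name, old_type in old.items():
        if name not in new:
            problems.append(f"response field removed or renamed: {name}")
        elif new[name] != old_type:
            problems.append(f"response field type changed: {name} ({old_type} -> {new[name]})")
    # Fields that exist only in `new` are additive and therefore not reported.
    return problems


def breaking_request_changes(old: RequestSchema, new: RequestSchema) -> List[str]:
    """List changes that break consumers sending the old request shape."""
    problems: List[str] = []
    for name, (new_type, required) in new.items():
        if name not in old:
            if required:
                problems.append(f"new required request field: {name}")
            continue
        old_type, was_required = old[name]
        if required and not was_required:
            problems.append(f"request field became required: {name}")
        if new_type != old_type:
            problems.append(f"request field type changed: {name} ({old_type} -> {new_type})")
    return problems


V1_RESPONSE = {"doc_id": "string", "label": "string", "created_at": "string"}
ADDITIVE_RESPONSE = {**V1_RESPONSE, "page_count": "integer"}  # hypothetical new field
RENAMED_RESPONSE = {"doc_id": "string", "category": "string", "created_at": "string"}

V1_REQUEST = {"label": ("string", False)}
STRICTER_REQUEST = {"label": ("string", True)}

print(breaking_response_changes(V1_RESPONSE, ADDITIVE_RESPONSE))  # []
print(breaking_response_changes(V1_RESPONSE, RENAMED_RESPONSE))
print(breaking_request_changes(V1_REQUEST, STRICTER_REQUEST))
```

The asymmetry is the point: new response fields are ignored by the check, while anything a consumer could already depend on is protected.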
Q: Explain consumer-driven contract testing with Pact. How does it differ from integration testing?
A: In Pact, the consumer writes a test that defines exactly what request it will make and what response it expects. Pact generates a "pact file" (contract) from this test. The provider downloads the pact file and runs a verification that checks it can satisfy every interaction without the consumer present. The key difference from integration testing: you do not need both services running simultaneously. Each side tests independently - consumer against a mock provider, provider against the consumer's contract. This catches schema mismatches in CI, before deployment, without a shared test environment.
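To make the mechanics concrete, here is a toy illustration of the idea in plain Python. It is deliberately not the `pact-python` API: the contract layout, `verify_provider`, and the two handler functions are hypothetical stand-ins for the pact file, the provider verifier, and two versions of the provider.

```python
from typing import Any, Callable, Dict, List

# What the consumer recorded: the request it sends and the fields it relies on.
CONTRACT: Dict[str, Any] = {
    "consumer": "processing-service",
    "provider": "upload-service",
    "interactions": [
        {
            "description": "get an existing document",
            "request": {"method": "GET", "path": "/v1/documents/doc-001"},
            "response": {"status": 200, "body_types": {"doc_id": str, "label": str}},
        }
    ],
}

Handler = Callable[[str, str], Dict[str, Any]]


def verify_provider(contract: Dict[str, Any], handler: Handler) -> List[str]:
    """Replay each recorded request against the provider and collect mismatches."""
    failures: List[str] = []
    for interaction in contract["interactions"]:
        desc = interaction["description"]
        request = interaction["request"]
        expected = interaction["response"]
        actual = handler(request["method"], request["path"])
        if actual["status"] != expected["status"]:
            failures.append(f"{desc}: expected status {expected['status']}, got {actual['status']}")
            continue
        for field, expected_type in expected["body_types"].items():
            if field not in actual["body"]:
                failures.append(f"{desc}: missing field {field}")
            elif not isinstance(actual["body"][field], expected_type):
                failures.append(f"{desc}: wrong type for field {field}")
    return failures


def handler_v1(method: str, path: str) -> Dict[str, Any]:
    """Provider as the consumer knows it."""
    body = {"doc_id": "doc-001", "label": "invoice", "created_at": "2024-01-15T10:30:00Z"}
    return {"status": 200, "body": body}


def handler_renamed(method: str, path: str) -> Dict[str, Any]:
    """Provider after a rename the consumer was never told about."""
    body = {"doc_id": "doc-001", "category": "invoice", "created_at": "2024-01-15T10:30:00Z"}
    return {"status": 200, "body": body}


print(verify_provider(CONTRACT, handler_v1))       # []
print(verify_provider(CONTRACT, handler_renamed))  # one failure: missing field label
```

Note that the extra `created_at` field does not cause a failure: the contract only pins down what the consumer actually uses, which is what lets the provider keep adding fields.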
Q: A provider wants to rename a response field. How do you do this backward-compatibly without breaking consumers?
A: Three-phase migration: (1) Add the new field to the response while keeping the old one, so both are present with the same value and consumers can migrate at their own pace. (2) Mark the old field as deprecated in the OpenAPI schema, announce a sunset date, and track which consumers still depend on it (via API logs or the Pact broker). (3) After the sunset date, and once all known consumers have migrated, remove the old field. This can take weeks or months depending on how many consumers exist. If a consumer has not migrated, its pact still expects the old field, so provider verification fails in CI before the removal can ship.
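A minimal sketch of the dual-field period, using plain dictionaries instead of the service's real models. The rename from `label` to `category` is a hypothetical example, and `serialize_document` and `read_category` are illustrative helper names, not part of any library.

```python
from typing import Any, Dict

OLD_FIELD = "label"      # name being retired (hypothetical choice)
NEW_FIELD = "category"   # replacement name (hypothetical choice)


def serialize_document(doc: Dict[str, Any], include_legacy_field: bool = True) -> Dict[str, Any]:
    """Provider side: emit both names during migration, only the new one afterwards."""
    body = {
        "doc_id": doc["doc_id"],
        NEW_FIELD: doc["category"],
        "created_at": doc["created_at"],
    }
    if include_legacy_field:
        # Same value under the old name, so unmigrated consumers keep working.
        body[OLD_FIELD] = doc["category"]
    return body


def read_category(payload: Dict[str, Any]) -> Any:
    """Consumer side: prefer the new name, fall back to the old one."""
    if NEW_FIELD in payload:
        return payload[NEW_FIELD]
    return payload.get(OLD_FIELD)


stored = {"doc_id": "doc-001", "category": "invoice", "created_at": "2024-01-15T10:30:00Z"}

during_migration = serialize_document(stored)                            # both fields present
after_sunset = serialize_document(stored, include_legacy_field=False)    # new field only

print(during_migration["label"], during_migration["category"])
print(read_category(after_sunset))
```

Flipping `include_legacy_field` to `False` is the final phase; a consumer that already uses a reader like `read_category` is unaffected by the switch.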
Q: Why should readiness probes NOT check downstream services like Classification Service or Email Service?
A: Readiness probes tell Kubernetes whether to send traffic to this pod. If Upload Service's readiness probe checks Classification Service, and Classification Service is down, Upload Service pods become "not ready" and Kubernetes stops sending traffic to them. But Upload Service might still be able to accept uploads (storing them and classifying later). Checking downstream services in readiness probes creates unnecessary coupling - one service's downtime propagates to all its callers' readiness. Readiness should check only what this service owns: its database connection, its cache, its own config.
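As a rough sketch of that rule, the readiness handler below aggregates only checks for resources the service owns. The check functions are stand-ins (a real service would ping its own database and cache), and `readiness` is a hypothetical helper whose result a `/ready` endpoint could return.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

Check = Callable[[], Awaitable[bool]]


async def check_database() -> bool:
    """Stand-in: a real check might run SELECT 1 against this service's database."""
    return True


async def check_cache() -> bool:
    """Stand-in: a real check might PING this service's cache."""
    return True


# Only dependencies this service owns. Classification Service and Email Service
# are deliberately absent: their outages must not mark this pod as not ready.
OWNED_CHECKS: Dict[str, Check] = {
    "database": check_database,
    "cache": check_cache,
}


async def readiness(checks: Dict[str, Check], timeout: float = 1.0) -> Tuple[int, Dict[str, str]]:
    """Return (HTTP status, per-check result); 503 if any owned dependency fails."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            ok = await asyncio.wait_for(check(), timeout=timeout)
        except Exception:
            # A timeout or connection error counts as a failed check.
            ok = False
        results[name] = "ok" if ok else "failed"
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results


if __name__ == "__main__":
    print(asyncio.run(readiness(OWNED_CHECKS)))
```

Downstream health is better handled at call time, for example by queueing uploads for later classification, than by taking the pod out of rotation.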
Q: What is the Tolerant Reader pattern, and how do you implement it with Pydantic?
A: The Tolerant Reader pattern states that a service should be lenient about the data it receives - accepting unknown fields without error and treating missing optional fields as having default values. In Pydantic v2, this is the default behavior: model_config = {"extra": "ignore"} silently discards unknown fields, and Optional fields with defaults handle missing data. The opposite, extra = "forbid", is appropriate for strict validation at a public API boundary but is too brittle for internal service-to-service communication where the provider may add fields at any time.
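A short sketch of both settings side by side, assuming Pydantic v2 is installed. The model names and the `page_count` field are hypothetical; the payload imitates a provider that has added a field and omitted an optional one.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class TolerantDocument(BaseModel):
    # "ignore" is already the Pydantic v2 default; it is spelled out for clarity.
    model_config = ConfigDict(extra="ignore")

    doc_id: str
    label: Optional[str] = None  # a missing optional field falls back to None


class StrictDocument(BaseModel):
    model_config = ConfigDict(extra="forbid")

    doc_id: str
    label: Optional[str] = None


# The provider added page_count (hypothetical) and did not send label.
payload: Dict[str, Any] = {"doc_id": "doc-001", "page_count": 3}

tolerant = TolerantDocument.model_validate(payload)
print(tolerant.model_dump())  # the unknown field is dropped, label is None


def strict_rejects(data: Dict[str, Any]) -> bool:
    """Return True if the strict model refuses the payload."""
    try:
        StrictDocument.model_validate(data)
    except ValidationError:
        return True
    return False


print(strict_rejects(payload))
```

The same payload that the tolerant model accepts makes the strict model raise, which is why `extra="forbid"` on an internal consumer turns every additive provider change into an outage.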
